A Non-Parametric Approach to Dynamic Programming

Authors

  • Oliver Kroemer
  • Jan Peters
Abstract

In this paper, we consider the problem of policy evaluation for continuous-state systems. We present a non-parametric approach to policy evaluation, which uses kernel density estimation to represent the system. The true form of the value function for this model can be determined, and can be computed using Galerkin’s method. Furthermore, we also present a unified view of several well-known policy evaluation methods. In particular, we show that the same Galerkin method can be used to derive Least-Squares Temporal Difference learning, Kernelized Temporal Difference learning, and a discrete-state Dynamic Programming solution, as well as our proposed method. In a numerical evaluation of these algorithms, the proposed approach performed better than the other methods.
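
To make the notion of policy evaluation concrete, the following is a minimal sketch of the simplest case the abstract names, a discrete-state dynamic programming solution. It is not the paper's Galerkin derivation or its kernel-density method. The function name `evaluate_policy`, the two-state transition matrix `P`, the reward vector `r`, and the discount factor `gamma` are all hypothetical and chosen only for illustration.

```python
def evaluate_policy(P, r, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iteratively solve V = r + gamma * P V for a fixed policy.

    P[s][t] is the (assumed known) probability of moving from state s
    to state t under the policy; r[s] is the expected reward in state s.
    """
    n = len(r)
    V = [0.0] * n
    for _ in range(max_iter):
        # One Bellman backup for every state.
        V_new = [
            r[s] + gamma * sum(P[s][t] * V[t] for t in range(n))
            for s in range(n)
        ]
        # Stop once the largest change across states is below tol.
        if max(abs(a - b) for a, b in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V


# Hypothetical two-state chain: state 1 is absorbing with zero reward.
P = [[0.5, 0.5],
     [0.0, 1.0]]
r = [1.0, 0.0]
V = evaluate_policy(P, r, gamma=0.9)
```

For this toy chain the fixed point can be checked by hand: the absorbing state has value 0, and the first state satisfies V0 = 1 + 0.9 * 0.5 * V0, so V0 = 1 / 0.55. Continuous-state systems, the subject of the paper, replace the finite sum with an integral over next states, which is where function approximation becomes necessary.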

Similar resources

A new non-parametric approach for suppliers selection

In this paper we propose a simple non-parametric model for the multiple-criteria supplier selection problem. The proposed model does not generate a zero weight for a certain criterion and ranks the suppliers without solving the model n times (one linear program (LP) for each supplier), and therefore allows the manager to get faster results. The methodology is illustrated using an example.

A Parametric Approach for Solving Multi-Objective Linear Fractional Programming Phase

In this paper, a multi-objective linear fractional programming problem with fuzzy variables and a vector of fuzzy resources is studied, and an algorithm based on a parametric approach is proposed. The proposed solving procedure is based on the parametric approach to find the solution, which provides the decision maker with more complete information in line with reality. The simplicity of the ...

A Multi-Objective Fuzzy Approach to Closed-Loop Supply Chain Network Design with Regard to Dynamic Pricing

During the last decade, reverse logistics networks have received considerable attention due to economic importance, environmental regulations, and customer awareness. Integration of leading and reverse logistics networks during logistical network design is one of the most important factors in the supply chain. In this research, an Integer Linear Programming model is presented to design a multi-layer...

A dynamic bi-objective model for after disaster blood supply chain network design; a robust possibilistic programming approach

Health service management plays a crucial role in human life. Blood-related operations are considered one of the important components of health services. This paper presents a bi-objective mixed integer linear programming model for dynamic location-allocation of blood facilities that integrates strategic and tactical decisions. Due to the epistemic uncertain nature of ...

Measuring a Dynamic Efficiency Based on MONLP Model under DEA Control

Data envelopment analysis (DEA) is a common technique in measuring the relative efficiency of a set of decision making units (DMUs) with multiple inputs and multiple outputs. Standard DEA models are quite limited models, in the sense that they do not consider a DMU at different times. To resolve this problem, DEA models with dynamic structures have been proposed. In a recent pape...

Optimal production and marketing planning with geometric programming approach

One of the primary assumptions in most optimal pricing methods is that the production cost is a non-increasing function of lot size. This assumption does not hold for many real-world applications, since the unit production cost may have a non-increasing trend up to a certain level and then start to increase for many reasons, such as an increase in wages, depreciation, etc. Moreover, the prod...

Publication date: 2011